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Abstract 

Background: Various computer-based methods exist for the detection and quantification of protein spots in two 
dimensional gel electrophoresis images. Area-based methods are commonly used for spot quantification: an area is 
assigned to each spot and the sum of the pixel intensities in that area, the so-called volume, is used a measure for 
spot signal. Other methods use the optical density, i.e. the intensity of the most intense pixel of a spot, or calculate 
the volume from the parameters of a fitted function. 

Results: In this study we compare the performance of different spot quantification methods using synthetic and real 
data. We propose a ready-to-use algorithm for spot detection and quantification that uses fitting of two dimensional 
Gaussian function curves for the extraction of data from two dimensional gel electrophoresis (2-DE) images. The 
algorithm implements fitting using logical compounds and is computationally efficient. The applicability of the 
compound fitting algorithm was evaluated for various simulated data and compared with other quantification 
approaches. We provide evidence that even if an incorrect bell-shaped function is used, the fitting method is superior 
to other approaches, especially when spots overlap. Finally, we validated the method with experimental data of 
urea-based 2-DE of A(3 peptides andre-analyzed published data sets. Our methods showed higher precision and accuracy 
than other approaches when applied to exposure time series and standard gels. 

Conclusion: Compound fitting as a quantification method for 2-DE spots shows several advantages over other 

approaches and could be combined with various spot detection methods. 

The algorithm was scripted in IVIATLAB (Mathworks) and is available as a supplemental file. 

Keywords: Two dimensional gel electrophoresis. Beta amyloid. Image analysis. Spot quantification 



Background 

2-dimensional gel electrophoresis (2-DE) is a powerfiil tool 
in proteomics and software-based analysis of gel images is 
an important step in the process. Various computer-based 
methods have been proposed for the detection of protein 
spots in 2-DE images [1]. Recent developments involve 
segmentation-oriented methods, which are based on the 
watershed transformation [2,3] or surface-oriented methods 
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[4]. However, a general concern regarding current spot 
detection and quantification algorithms is that no method 
is fully automated, because they require the use of user- 
defined parameters [1]. Often, more complex methods 
require additional input parameters [5] and thus can be 
difficult to implement for a user less experienced in 
image processing. This difficulty and other factors might 
be the reasons why the automated image analysis of elec- 
trophoretic data is not yet standard in all fields of re- 
search. Despite numerous solutions for automated 2-DE 
analysis have been introduced [1], the majority of these 
approaches focus on spot detection rather than on spot 
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quantification. The amount of marked protein in an elec- 
tro chemiluminescent spot is proportional to the number 
of photons detected by the camera in the respective pixels, 
which is proportional to the intensities of those pixels. 
Therefore, to obtain measures for protein quantities, one 
has to determine the overall signal amount of every spot. 
Common methods for spot quantification are area-based 
approaches that quantitate spots based on the areas they 
occupy in the image. There are various published methods 
for determining these spot areas [3,4,6,7], many of which 
make use of the watershed transformation. Spots can be 
quantitated by various area-related measures and many 
methods suggest the use of volume, i.e. the sum of pixel 
intensities within the respective areas. However, as we 
show in this paper, area-based approaches face certain 
limitations, especially in analyzing superimposed spots. 
The optical density (OD), i.e. the intensity of the most 
intense pixel of a spot, can also be used to quantify 
spots [8] . A different quantification approach is to find a 
fitted function for each spot and calculate the volume 
under the surface (VUS) of the function curve. Fitting of 
2D-Gaussian function curves has been used for the 
quantification of 2-DE spots for a long time [9,10], des- 
pite the fact that the adequate functions to model 2-DE 
spots continue to be contentious [1,11]. Recent develop- 
ments include a method to improve the applicability of 
Gaussian functions for saturated spots [12] or the use of 
2D-gaussian function curves in the creation of synthetic 
gel images for the evaluation of image analysis algorithms 
[13]. In this study, we propose a simple algorithm that fits 
Gaussian functions to detected spots. We introduce the 
novel concept of compound fitting, i.e. the simultaneous 
fitting of neighboring groups of spots, for computationally 
efficient resolution of overlapping spots. Furthermore, 
we conduct the, to our knowledge, first structured com- 
parison between different spot quantification approaches, 
namely the use of optical density of a spot, area-based 
quantification and Gaussian function fitting. The proposed 
compound fitting method is a highly accurate system for 
spot quantification and we demonstrate its superior per- 
formance relative to other methods for both synthetic and 
real data. 

Regarding real 2-DE data, we chose beta amyloid (Ap) 
peptides as our primary study system for two reasons: 
Firstly, 2-DE is a valuable tool in the analysis of APpeptides. 
Although it is difficult to separate low concentrated beta 
amyloid (A^) peptides based only on their almost identi- 
cal masses, their high content of hydrophobic amino 
acids has been used to efficiently separate Ap species 
with single amino acid differences in urea gels according 
to their hydrophobicity [14]. However, efficient separ- 
ation can only be achieved using 2-dimensional gel 
electrophoresis because many species have identical 
hydrophobicities but different isoelectric points [15]. 



The use of 2-dimensional electrophoretic methods has 
the advantage of displaying more than 30 different 
peptides at very low concentrations [15], that in surface 
enhanced laser desorption/ionization time-of flight 
mass spectroscopy are not detectable [16]. Secondly, 
exact quantification is especially important in the field 
of Ap peptides, because it is necessary not only for the 
analysis of differential expression in different experi- 
mental conditions, but also for the use of APpeptides 
as biomarkers. The quantification of single Ap peptide 
species and their ratios, e.g. Api.42 and APi_4o in particular, 
is valuable for the neurochemical diagnosis of Alzheimer's 
dementia [16]. High-quality quantification is crucial to use 
these peptides as biomarkers [17]. 

Methods 

Spot detection and compound fitting algorithm 

Our algorithm comprises two major components: spot 
detection/identification and spot quantification. The flow- 
chart of the algorithm is shown in Figure la. Spot detec- 
tion (step 2 and 3) was performed using a simplified 
version of an established method [18,19]. The algorithm 
involves a user-defined parameter w, which is used as the 
minimum allowed distance between peaks. A peak is de- 
fined as the most intense pixel of a spot. For 2-DE spots, 
the peak is usually in the center of a spot. Thus, w must 
be chosen to be smaller than the distance between two 
spots but sufficiently large to prevent the detection of 
several peaks on one spot. For the application of the al- 
gorithm on real data, the following user-friendly and in- 
tuitive method for selecting the value of w may be 
employed: the user is asked to click on the peaks of the 
two clearly identifiable spots with the smallest inter- 
peak distance (IPD). 
Based on this input, w is calculated as 

'IPD 



where [ ] is the Gaussian floor function because w must 
be an integer. If w is set/calculated too small, several 
peaks might be detected on top of a broad spot. If w is 
estimated too large, the detection might fail to resolve 
overlapping spots. 

Estimation of background noise 

The first step of the spot detection algorithm is the esti- 
mation of the standard deviation (SD) of the image noise 
Ons, which is performed in an area where only back- 
ground signal is detected. Usually such an area without 
protein spots can be found in any typical 2-DE image, 
often near the edges. We used the following approach to 
implement the estimation. 
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Figure 1 Schematic illustration of the image analysis procedure, (a) Flowchart of the algorithm, (b) Representative Image of 2-dimensional 
urea gel electrophoresis of AP peptides from human plasma, (c) Detected spots outlined, (d) Fitting compounds are indicated, (e) Fit result 
(image calculated from the fitted functions), (f) Residual image; The same level of contrast is used for all images (b) - (f). 



The entire image A(x,y) is rastered with a quadratic w 
by w pixel grid. For every pixel (x,y), the SD aareaix:y) and 
the mean value l^area(x,y) of the surrounding why w area's 
pixel intensities are calculated. fiarea(x,y) of a pixel in a back- 
ground area should be less than the mean of all fiarea{x>y), 
because the majority of a gel image is occupied with back- 
ground. Thus, the mean value of ffareaiXiy) that also satisfies 

y) < meanij{fi^^^^{ij)) is taken 
Image preparation 

Like any digital image, 2-DE images suffer from various 
types of noise and imperfections. We implemented noise 



reduction and image restoration for the spot detection 
procedure, whereas the function fit was performed using 
the original image A(x,y). 
The following operations were performed: 

- the local background was modeled using a 
boxcar average over a square region of edge 
length 2 - w + 1, 

- random digital noise was suppressed by convoluting 
the image using a Gaussian surface of standard 
deviation a/2 and 

- the restored image was calculated by subtracting 
the background image from the noise-reduced 
image. 



Brauner ef al. BMC Bioinformatics 2014, 15:181 
http://www.biomedcentral.com/1471 -21 05/1 5/181 



Page 4 of 1 2 



These steps can be performed in one convolution. The 
restored image, which is the basis for further spot detection, 
was calculated as 



Aresix,y) = ^ ^A{x-i,y-j)K{iJ), 



where the convolution kernel K, normalization constant 
/<o and factor B are given as follows: 



Ka 



1 / e+f 

- — 



1 



B 



B 



J2 exp 



{2w + iy 

B 



(2w+ 1) 



E 



exp 



Peak detection 

In the following section, we will use the term 'peak' to 
refer to the most intense pixel of a protein spot, usually 
in the center of the spot. In the restored image, a par- 
ticular pixel was considered to be the location of a peak 
if it met two conditions: 

a) It was the brightest pixel within the distance of w. 

b) Its intensity was greater than a threshold defined by 
a^is and the local factor which is calculated as the 
mean intensity of all pixels within the distance of w. 

a) This criterion is efficiently computed with a 
grayscale dilation of the image with a flat structuring 
element of circular shape and radius w. 

The dilated image -A^// is given as 

Adii{x,y) = max^.2^.2^^2^{Ares{x + i,y + J)). 

Consequently, pixels {Xp,yp) that satisfy 
Ares (xp , Jp) = Ada (xp . ypj 

are potential peak candidates. 

b) After the peak candidates have been identified, a 
pixel {Xp,yp) is considered a peak if it satisfies 



A(xp,ypJ > ^^(xp,ypj + t-a„ 



where 



Based on our experience, t = 10 yields good results for 
real 2-DE images. For t < 10, a higher rate of false- 
positive detections is expected, especially for relatively 
large image sizes. If t is set too large, the rate of false 
negatives might increase. 

Determination of the fitting compounds 

For every spot, a Gaussian function is fitted in 4 parame- 
ters (for the equation, see next paragraph). For an as- 
sumed number of 50 spots per image, fitting the entire 
image at once would mean calculating the (nonlinear) 
influence of 200 parameters for every pixel, which is not 
possible within a reasonable computation time. To solve 
this problem, we used a compound fitting process that 
only fits neighboring groups of spots within the sur- 
rounding area of pixels simultaneously. 

The general concept is that the spots of an image are 
divided into compounds - in such a way that the spots 
from one compound do not affect the pixels of another 
compound area. The defined groups of spots can then 
be fit individually within their respective compound 
areas, thereby significantly reducing the necessary com- 
putation time. 

In the following, the term "compound" will be used 
for a group of spots that are fitted at the same time. The 
term "compound area" refers to the area that surrounds 
all the spots from one compound, comprising all the 
pixels that are significantly influenced by the spots and 
therefore the area that must be considered to be able to 
perform a proper fit. 

The compound area around a single spot is a quadrat 
of edge length d. The minimum size of d is calculated 
in the results section. If two spots are close to each 
other in such a way that their compound areas would 
overlap, the overlapping area is influenced by both 
spots, and therefore, these spots must be fit at the same 
time. In this case, their individual compounds need to 
be fused, which means that if any two peaks P and Q 
satisfy |xp - jicqI <d and \yp - yql <d, they always belong 
to the same compound. This definition is compatible 
with the fact that in a single fitting compound, pairs of 
peaks can occur that do not satisfy this condition. For 
example, if there are three peaks P, Q, and R with 



Xq I ^d^ 



yp-yq 



<d, \xQ-xii\<d and yq-yn id, 



there will be only one fitting compound for all three spots, 
even if \xp - Xk\ > d. 

In this case, peak P and Q are called connected, whereas 
P is called connected to R via Q. 

A reasonable approach for the determination of the 
compounds is to calculate the adjacency matrix: 

The peaks are numbered, where N is the number of 
peaks/spot. 
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has a size of NxN and is defined as 



1, if \xi-Xj\<d and 
0, otherwise 



yryj 



<d 



where Xi,yi,Xj,yj are the pealc coordinates of the respective 
spots. 

The subscripts indicate that h^norm is the adjacency 
matrix of first order and it is normalized so that the entries 
are either 0 or 1. The entry of Ii,norm at a certain position 
(P,Q) has the value 1, if the peaks P and Q are connected. 
Otherwise, the entry has the value 0. Of course, this con- 
straint also means that all diagonal elements are 1. 

Next the square of l\,norm is calculated 

T — J 2 

^2 — ^ l,norm 

and thus 

N 



h{ij) — 'y^Jl.normji, k) ■ 1 1 ,norm{k , j) ■ 



h,norm IS 3 matrix identical to I2 but with all elements 
larger than 1 set to 1. Thus, 



^ l.norm 



1, if h{Li)>l 
0, if hiij) = 0 



If there is pair of (P,Q) with l2,norm{P> Q) = !> then there 
is a value A:i e 1 ...N that satisfies 

(/ci,Q) >0. 

Consequently, 



and 



and therefore, peak p is connected to peak q via peak k^. 

For the matrix l2,normt an entry of value 1 at a certain 
position (P,Q) indicates that peaks P and Q are con- 
nected via at most one other peak, whereas the value 0 
indicates no connection. 

This procedure of squaring and normalizing is repeated 



until 4 



norm ^{h - l),norm 



and the various fitting com- 



pounds containing the respective spots can be read out. 

Nonlinear least squared error fit of 2-dimensional Gaussian 
distribution curves in respective compounds 

A Gaussian function of circular shape is characterized by 



f{x,y) = I- exp -0.5 



{x-xo) + (j-Jo) 



This leaves four parameters: the x- and y- coordinates 
of the peak (xo and yo), the spot's intensity / and its 
width/standard deviation a. 

A nonlinear least-squared error type of fit is used. Be- 
fore the fit, the mean value of the background fibg must 
be calculated. Because all areas having a spot signal are 
enclosed by compound areas fibg can simply be calcu- 
lated as the mean of the intensity values of all pixels that 
are not included inside a compound area. Therefore, the 
function that is actually fit to model the data is 



f{x,y) = /• exp -0.5 



{x-xq) +{y-yo) 



with the 4 aforementioned parameters. 

After determining the fitting compounds, each com- 
pound is fit individually; the spots of one compound are 
all fit at the same time, within the respective compound 
area. 

The volume under the surface (VUS) of a Gaussian 
function curve, which is proportional to the number of 
photons detected by the camera and thus to the amount 
of protein at a certain location, can be calculated using 
the results from the fit via 



VUS 



00 00 

1^ 



exp -0.5' 



: (x-xo) + iy-yo) 



dx dy 



Discrimination of false positives 

Because spots can vary greatly in their size and intensity, 
it is difficult to define criteria for deciding whether a de- 
tected spot is a true or a false positive (i.e., whether a de- 
tected spot corresponds to a real spot). However, spots 
that are fitted with a very small width/ standard deviation 
may represent camera noise because certain types of cam- 
era noise usually affect single pixels that subsequently form 
peaks without a broad base. 
Thus, spots with tr < 1 can be discriminated. 

Comparison of different spot quantification methods 

When we applied the three quantification methods to 
simulated gel images containing two spots P and Q, each 
method yielded different values signalp and signalq, which 
correspond to the amount of protein signal in spot P and 
spot Q, respectively. 

For the OD-method, signalp and signalq represent the 
intensity of the brightest pixel of the respective spot; for 
the area-based method, they represent the sum of the 
pixel intensities in the respective spot areas. For our 
compound fitting method, these values represent the 
VUS of the curves calculated using the function parame- 
ters obtained by compound fitting. Consequently, there 
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was no use in evaluating the different approaches on the 
basis of the absolute values of signalp and signalQ. In- 
stead, it was necessary to consider the quantity { ^signai 'i^ • 

because it shows how a method calculated the ratio of 
protein signal between the two spots. For the simulated 
spots, the true signal ratios can be calculated from the 
function parameters used for simulation. 

This provided a measure a of the quality of the 
quantification, which was defined as the deviation of 
the calculated signal ratio from the true signal ratio: 

For further information on the different models and 
parameters used for data simulation, refer to Additional 
file 1. 

Availability of supporting data 

A ready-to-use version of this algorithm with graphical 
user interface as well as a command-line version was 
implemented for MATLAB (Mathworks). The code is 
included to the article in Additional file 2. 

Results 

The compound fitting algorithm 

We developed an algorithm for the detection and quantifi- 
cation of protein spots on two dimensional gel electro- 
phoresis (2-DE) images. Spot quantification is performed 
by fitting of 2-dimensional Gaussian function curves to 
the spots and calculating the volume under the surface 
(VUS) of the fitted functions. Before fitting, the spots are 
divided into fitting compounds. Only groups of neighbor- 
ing spots, that show some grade of overlapping, are fitted 
simultaneously. The different compounds are fitted se- 
quentially, in order to reduce the calculation time. For the 
determination of the minimal required compound area 
size, an evaluation of the fitting performance on asymmet- 
ric spots and an comparison regarding computation time 
between compound fitting and common fitting proce- 
dures, please refer to Additional files 3 and 4. 

Comparison of different approaches of spot 
quantification using synthetic data 

Next, we compared compound fitting with other quantifi- 
cation methods. We chose to compare the following three 
methods for spot quantification: 

Method 1: Optical density (OD) 

The intensity of the most intense pixel of a spot, i.e. the 
height of the peak, is used as the measure of signal 
quantity. This approach is used in recent approaches 
[8,20] and has been reported to be a better [21] as well 
as a worse [22] measure of spot quantity than the area- 
based volume. It has been argued that OD is suitable for 



the resolution of overlapping spots, because it only takes 
into consideration the peak/center of a spot, where the 
contribution of overlapping parts of neighboring spots is 
expected to be small [23]. 

Method 2: Area-based approach 

In an area-based approach, a certain area of the image 
that is assigned to each protein spot is used as the base 
for quantification. This is the most common method for 
spot quantification and usually the sum of the intensities 
of the pixels in an 'area', the so called 'volume', is used as 
the measure for spot signal [4,6,7,20]. Our example algo- 
rithm determined these areas according to the following 
scheme. First, the area of the protein signal was deter- 
mined using all pixels that surpassed a certain intensity 
threshold. Assigning areas to spots always requires a 
somewhat arbitrary threshold because spots usually are 
continuous structures without abrupt edges that would 
justify borders of the area. If two spots were present in 
one area, it had to be decided which pixel belonged to 
which spot. For this decision, the minimum of the line 
profile between the two spots (P,Q) was calculated, and a 
perpendicular to the inter-peak line was drawn through 
the minimum; pixels on one side of this perpendicular 
were assigned to spot P, and pixels on the other side 
were assigned to spot Q. Finally, the 'volume' was calcu- 
lated by summing the pixel intensities of each area. 

Method 3 Fitting of 2-dimensional Gaussian function curves 
to determine the parameters 

The three methods are illustrated in Figure 2a-d. 

The quantification of overlapping spots 

Having defined the three different quantification methods 
and a measure for the quality of quantification a (see 
'Material and methods'), we compared the three methods 
regarding their performance in the quantification of 
overlapping spots. Despite the fact that the use of 2-D 
Gaussian functions to model protein spots is disputed 
[l,ll],a function fit as a way to quantify protein spots has 
some advantages regarding spot superposition. In super- 
imposed areas, one pixel can be influenced by several 
spots. While area-based methods can only assign such a 
pixel to one spot and thereby neglect its amount of inten- 
sity related to other spots, spot superposition can be re- 
solved by a function fit. For the comparison we simulated 
gel images containing two spots with a varying grade of 
superposition. Taking into account that the use of 2-D 
Gaussian functions as a model for protein spots is dis- 
puted, we simulated not only Gaussian spots but also 
spots with a shape based on Lorentz curves or based on a 
diffusion model.The former have been discussed as a func- 
tion that models the shapes of spots in 1-D electrophoresis 
gels [24] and the latter have been suggested as another 
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Figure 2 Comparison of the different quantification approaches on simulated data sets - spot superposition, (a) Line profile of the inter-peal< 
line of two superimposed simulated spots of Gaussian shape. Dashed lines: superimposed curves. Solid line: resulting signal, (b) - (d) Depiction of 
measures for spot quantification yielded by the three methods (b) The OD-based approach uses the heights of the spots for quantification (c) The 
area-based approach estimates the volumes of the spots over the assigned areas. In this two-dimensional scheme, the volumes are depicted as areas 
P and Q. (d) In our method, the function parameters of the fitted curves are used to calculate the VUSs of the spots. (e,g,i) Performance of the different 
quantification methods for two representative Gaussian-shaped (e), Lorentz-shaped (g) or diffusion model-based (i) spots at different extents of spot 
superposition. The error of quantification a (as defined in the methods section) increases with decreasing IPD for all quantification approaches. 
However, compound fitting is most robust to spot overlapping. At IPD = 2 ■ Og, an increasing deterioration can be noticed (arrow, see text). 200 
images per data point. (f,h,k) Our method yielded a significantly smaller error of quantification than the two other approaches for Gaussian-shaped 
(f) and Lorentz-shaped (h) spots at small (IPD = 2oq, 1 .6sq or 1.7Dq) and large IPD (IPD = 3.5oq, 3sq or 3.5Dq), but only at small IPD for diffusion 
model-based spots (k). (I) Example of simulated image area containing several Gaussian spots with a range of parameters (m) Performance of 
the three quantification methods on simulated images with an increasing density of Gaussian spots. Qmean a measure for quantification similar 
to a, adapted for the evaluation of several spots. (***: p < 0.001, unpaired two-sample t-test with Bonferroni correction). For details on data 
simulation, refer to Additional file 1. 



potential function modeling 2-DE spots [11]. For all three 
simulated spot shapes, the results were similar. The OD 
miscalculated the spot ratio even at high IPD, but the re- 
sult was relatively robust to spot overlap. The area-based 
approach yielded good results for high IPD values, but 
failed to resolve overlapping spots. Only compound fitting 
yielded satisfactory results for high and low IPD values, 



not only for Gaussian shaped (Figure 2e,f), but also for 
Lorentz-shaped (Figure 2g,h) or diffusion model-based 
(Figure 2i,k) spots. Importantiy, the same level of precision 
of our method when applied to Gaussian spots could 
be achieved neither for Lorentz-shaped spots nor for 
diffusion model-based spots. With increasing spot super- 
position, there was a sudden large increase in a for the 
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area-based approach, as shown with an arrow. At this 
point, the two spots were superimposed in such a way that 
there was no valley present on the line profile from peak P 
to peak Q, but the line profile was a monotonically de- 
creasing curve and Q was the minimum. This led to fur- 
ther deviations in the assignment of the areas because the 
line used to separate the areas corresponding to each spot 
was drawn through the peak coordinates of Q. Remark- 
ably, spot detection in these cases still succeeded to de- 
tect two separate spots, because peak detection was 
not performed on the original but rather on the re- 
stored image A^es (see Image preparation). 

The ability of a method to correctly quantify overlap- 
ping spots directly influences the method's ability to 
quantify areas with a high spot density. Indeed, when we 
simulated images with an increasing number of Gaussian 
spots of various parameters (Figure 21), the result was 
similar to the studies with only two spots (Figure 2m). 
The area-based approach yielded good results for low 
spot density, but the quality deteriorated for higher spot 
densities. The OD misestimated spot ratios, but was af- 
fected less than the area-based approach by increasing 
spot density. Compound fitting yielded good quantifica- 
tion for low and high spot density. 

In summary, compound fitting proved to be a valid 
method for the quantification of overlapping spots, while 
area-based approaches faced problems especially at low 
inter peak distances. This effect was also observed for 
not-exactly Gaussian shaped spots. 

The quantification of spots with different intensities 

Furthermore, we compared the methods regarding their 
performance in the quantification of spots of different 
intensities. The use of area-based approaches is ques- 
tionable because the shapes of both protein spots and 
Gaussian function curves do not have any abrupt edges 
but the intensity values decline continuously with dis- 
tance from the spot's center. Therefore, arbitrary thresh- 
olds are needed to assign a limited area to a protein spot 
without well-defined edges. As area-based approaches by 
definition evaluate spots based on limited areas, they al- 
ways neglect a certain portion of the generally infinite 
spatial extent of a spot; i.e., they omit the periphery of a 
spot. This effect generally does not pose a problem in the 
comparison of similar spots because the measure for each 
spot is affected equally, and thus, the ratio of the spots re- 
mains the same. However, this ratio can be affected when 
comparing spots of varied intensities. In our area-based 
approaches, the area corresponding to a protein spot was 
defined by all pixels that surpass a certain fixed intensity 
threshold. In this manner, larger areas are assigned to 
spots with a higher intensity. Although it may intuitively 
appear correct to assign larger areas to brighter spots, this 
method omits a larger percentage of the signal of small 



spots than that of spots with a high intensity (Figure 3a, 
inlay) and thus might compromise the comparison of 
spots with different intensities. 

To compare the three methods in the quantification of 
spots with different intensities and to test this hypothesis, 
we simulated gel images containing spots of varying inten- 
sities and compared the volume calculated by the quantifi- 
cation approaches to the true spot signal. We used 
Gaussian-shaped (Figure 3a), Lorentz-shaped (Figure 3b) 
and diffusion model-based (Figure 3c) spots in the simula- 
tion. For all three types of spots, the area-based approach 
increasingly underestimated the spots with lower inten- 
sities, in accordance with the above explanation. The func- 
tion fit overestimated the Lorentz-shaped spots and the 
diffusion model-based spots by approximately 20 percent 
or 8 percent respectively, which is readily comprehensible 
because Lorentz curves and diffusion model-based curves 
are different from Gaussian curves. However, the advan- 
tage of our method over the area-based approach was that 
our method overestimated all spots by a similar percent- 
age, therefore enabling unbiased comparisons of spots 
with different intensities. It should be mentioned that this 
problem with area-based approaches is not an issue of 
poor assignments of the areas but is inherent to the sys- 
tem of evaluating curves with an infinite spatial extent 
based on finite areas. For example, instead of defining 
the areas using an intensity threshold, one could use an 
equally sized area for every spot (e.g., a circle around 
the peak with a fixed radius). This approach would solve 
the difficulties associated with comparing spots of differ- 
ent intensities but would compromise the comparison of 
spots with different widths. 

The OD does not calculate a volume, but takes the in- 
tensity as a measure of spot signal. Therefore it could not 
be included into this experiment, because we compared 
the volume yielded by the methods to the true volume. 
However, as the parameter relevant for the OD is the 
height of the spot and this is the only parameter that is 
varied in this experiment, the OD perfecdy depicted the 
ratio between spots of varying height (data not shown). 

In conclusion, the area-based method biased the quanti- 
fication of spots with different intensities, because it omit- 
ted a larger part of the signal for small spots than for large 
spots. For Gaussian as well as not-exactly Gaussian shaped 
spots, the compound fitting of Gaussian function curves 
did not show such a tendency. 

Comparison of different approaches of spot 
quantification using real data 

We chose AjBpeptides as our primary study system for the 
evaluation of the different spot quantification methods. 
The immunoblots used differ from average 2-DE images 
in terms of spot number, with the immunoblots contain- 
ing relatively few spots. The application of our method to 
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Figure 3 Comparison of tlie different quantification approaches on simulated data sets - spots of varying height, (a) Performance of 
area-based and function fit approaches for simulated single Gaussian-slnaped spots of varying intensity. Tlie function ft yielded correct VUSs for 
all intensities, although the area-based approach increasingly underestimated the volume at lower intensities. The intensities are indicated as 
multiples of the threshold that was used for defining the areas in the area-based approach. The inlay shows representative line profiles of relatively 
high- and low-intensity spots. The red area corresponds to the periphery of the spot, which would be omitted by an area-based approach. For the spot 
with the lower intensity, the red area comprises a higher percentage of the overall signal of the spot. 200 images per data point, (b) Performance of 
the area-based and function fit approaches for simulated single Lorentz-shaped spots of varying intensity. While our method consistently overestimated 
the VUS by approximately 20 percent, the area-based approach yielded an increasing level of underestimation at decreasing spot intensities. 200 images 
per data point, (c) Performance of the area-based and function fit approaches for simulated single diffusion model-based spots of varying 
intensity. While our method consistently overestimated the VUS by approximately 5 percent, the area-based approach yielded an increasing 
level of underestimation at decreasing spot intensities. 200 images per data point. 



a large 2-DE image containing about 1000 spots and the 
comparison with the noncommercial methods BEADS [4] 
and pinnacle [8] and the commercial software package 
Melanie 7 (Geneva Bioinft)rmatics SA) can be found in 
Additional files 5 and 6. 

Precision of the different quantification methods 

We further evaluated our method on real data. To create 
a standard for the evaluation of our method, 2-D A(3 
Western immunoblot was performed on Ap standard so- 
lutions with fixed amounts of the different Ap species. 
2-DE was performed on Ap standard solutions, the pep- 
tides were transferred to PVDF membranes and detected 
by a Horseradish peroxidase coupled antibody reaction 
(n = 4). Images of the blots (Figure 4a) were taken at dif- 
ferent exposure times. 

We then tested the different quantification approaches 
on the images of the exposure time series. There were 
seven relevant spots on every blot, because the standard 
solution contained seven Ap species. For each image of an 
exposure time series, we calculated the contribution of 
each spot (relative spot signal) to the sum of the overall 
signal of all spots ("total Ap immunoreactivity") in that 
image. This is depicted for one blot in Figure 4b. Subse- 
quently, we determined for each spot the coefficient of 
variation (CV) of the relative spot signal across the expos- 
ure time series. Since the images of an exposure time 
series depict the same blot, a sound quantification method 
is expected to yield little variance of the relative spot in- 
tensities. Across all spots and blots, the mean CV of rela- 
tive signal yielded by compound fitting was 0.043 (0.016 
for the most intense, 0.065 for the least intense spot). For 
the OD and the area-based approach, the mean CV was 
0.305 (0.240 for the most intense, 0.295 for the least in- 
tense spot) and 0.298 (0.208 for most intense, 0.317 for 
least intense spot) respectively. However, some of the 



images of the exposure time series were saturated. The 
OD is clearly not suitable for the analysis of saturated 
spots, because the OD of saturated peaks will always 
be the saturation threshold (as can be seen in Figure 4b). 
We therefore repeated the analysis, taking into consider- 
ation only the unsaturated images of each exposure time 
series. For the unsaturated images, the mean CV of the 
relative signal across all spots and blots was 0.007 for 
compound fitting, 0.044 for OD and 0.140 for the area- 
based approach. 

In summary, for different images of the same blot, the 
spot intensities yielded by compound fitting have less vari- 
ance compared to OD and area-based approach. When 
saturated images were included, the performance of OD 
and the area-based approach were similar. However, when 
only unsaturated images were analyzed, OD was superior 
to the area-based approach. 

Accuracy of the different quantification methods 

Having evaluated the precision of our method, we also 
wanted to measure the accuracy of compound fitting, 
i.e. the grade to which the results display the actual pro- 
tein amounts. For each blot, we used the most intense 
image that did not show any saturation. We compared 
only the APi_37/38/39/4o/42 spots in our standard immuno- 
blots, because the primary antibody (mAb 1E8) used for 
staining binds to the N-terminus and the differential bind- 
ing of the antibody compromises the comparison of APi_x 
and AP2_xSpecies. In all four blots, we compared the rela- 
tive spot signal of an Ap species yielded by the three quan- 
tification methods to the percentage of the amount of 
that species in the standard solutions (relative peptide 
amount). For compound fitting, the mean deviation of the 
relative signal to the relative peptide amount was 3.81%, 
for OD and the area-based approach the mean deviation 
was 4.37% and 4.71%, respectively. 
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Figure 4 Application of compound fitting on real data, (a) One example blot of four blots of Af? standard solutions. Api -37/39 (30 pg, each), 
Api -38/42 (60 pg, each), Api -40 (120 pg), and Ap2-40/42 (60 pg, each), (b) For one example of the four blots, the quantification of all images of 
the exposure time series is show. The relative spot signal (contribution of the spot signal to the overall signal of all spots combined) of the spots 
1-7 was calculated in each of the 12 image of the exposure time series. One color is representative for one exposure time. It can be observed that 
compound fitting yielded the smallest variance of relative spot signal across the exposure time series, (c) 3-D surface plot of immunoblot image 
of Ap peptides from human plasma, (d) Surface plot of the image calculated from the fitted Gaussian functions, (e) Residual image, difference 
between original and fitted image, (f) Correlation of the published volumes with the VUSs calculated from the fitted functions. The relative 
contribution of each spot to the over-all signal of the image ("total Ap immunoreactivity") is displayed. The published relative volumes for the 30 
spots correlate well with our results, with a spearmancorrelation coefficient of p = 0.87. The black curve is the angle bisector, (g) Comparison of 
published volumes and calculated VUSs as a function of the spot intensity. With respect to the 8 most intense spots (red), our method yielded 
smaller relative VUS values than were published for 6 out of 8 spots, and higher values were obtained for 2 spots. In contrast, of the 22 lower-intensity 
spots (blue), our method calculated lower VUSs relative to the published data only for one spot, whereas it found the other 21 spots to be more 
intense than the published data. 



This means that compound fitting has a 13% or 19% 
higher accuracy than OD or the area-based approach, re- 
spectively. However, one has to be cautious when compar- 
ing the protein amount with the spot signal. Differential 



binding of the antibody, solubility of the different species, 
possible oxidation reaction and many more variables can 
influence the relation between protein load of the gel and 
detected signal. 
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Comparison of compound fitting with published, area 
based, results 

Next, we applied our algorithm to immunoblot images of 
A|3 peptides in human plasma in healthy controls using 
datasets which we published previously [15]. When our al- 
gorithm was applied to the original images, we found that 
it produced a suitable fit for the images (Figure 4c-e). 

In our previous publication, 30 peptide spots were iden- 
tified, and the relative contribution of each spot to the 
sum of the overall signal of all counted spots ("total A(3 
immunoreactivity") was calculated using an area-based ap- 
proach (Quantity One software v4.1, BioRad). The relative 
spot signals calculated by our method closely matched the 
published values (spearman's rho p =0.87, Figure 4f). 

Interestingly, the differences between the published 
area-based results and the results of the fit of the corre- 
sponding images showed a systematic trend. For spots with 
a low intensity, our method yielded higher relative values 
than the area-based approach, whereas for the spots with 
the highest intensities, the area-based approach yielded 
higher values (Figure 4g). This finding is in accord 
with our simulations on spots with different intensities 
(Figure 3). In the presented data, the peak intensity 
values varied by the factor 100 from lowest to highest. 
Because low-intensity spots are underestimated by the 
area-based approach but not by the function fit method, 
the area-based approach calculates a lower contribution to 
the total A|3 immunoreactivity for such spots. Conse- 
quently, because the sum of all spots must be equal to 
100%, the area-based approach overestimates spots with a 
high intensity. 

In summary, area-based approaches might underesti- 
mate small spots in immunoblot images of Ap peptides 
in human plasma. 

Discussion 

In this present work, we validated the compound fitting 
of two dimensional Gaussian functions as a method for 
the quantification of two dimensional gel electrophoresis 
(2-DE) data, thereby enabling robust comparisons between 
different protein spots within a brief computation time. 
The fitting method was found to be valid for quantifying 
simulated data and was further evaluated using immuno- 
blots of Ap standard solutions and a published set of 2-DE 
images. The comparison of the compound fitting ap- 
proach to area-based quantification methods revealed two 
advantages of the function fit method: 1. an improved 
ability to resolve superposed spots and 2. The unbiased 
comparison of spots with greatly differing heights/widths. 
The optical density (OD), i.e. the intensity of the most in- 
tense pixel of a spot, as a measure for spot signal does not 
take into account the width of a spot and is therefore not 
suitable to depict the signal ratio between different spots. 
However, it perfectly depicts the ratio between spots of 



similar shape but different height and is relatively robust 
to spot overlapping. It might therefore be useful for differ- 
ential expression analysis across several gels, especially if 
the spots are strongly distorted. When we applied all three 
methods to exposure time series and standard gels of A|3 
peptides, compound fitting showed the best precision as 
well as accuracy. Although for our data, our method pro- 
duced suitable fits with little residues (Figure 4c-e), it has 
also been argued that protein spots are often not modeled 
exactly by Gaussian spots [11]. However, even for not- 
exactly-Gaussian spots, if the underlying function is not 
known, compound fitting may be the method of choice 
due to the aforementioned advantages, as demonstrated 
by the application of our algorithm to Lorentz-shaped or 
diffusion model-based spots. Although, as expected, our 
algorithm miscalculated those spots, it nevertheless out- 
performed the area-based approaches in terms of the 
resolution of superimposed spots (Figure 2e-m) and 
the comparison of spots of varying height (Figure 3). 
Due to these advantages, fitting of 2-dimensional Gaussian 
function curves has a higher precision (Figure 4b) and ac- 
curacy than area-based approaches when evaluated on im- 
munoblots of standard peptide solutions. 

Conclusion 

Low SNRs and heterogeneous image conditions across 
different gels are common problems in 2-DE data analysis 
[7]. We found that it was most practical to robustly detect 
peaks and then aUow the user to inspect the results and 
add or discard peaks based on experience. Spots that are 
clearly separated in one experiment may not be detected 
as separate spots in another gel. In this case, the experi- 
enced user might add or correct peak locations. However, 
because all spot detection methods have advantages and 
drawbacks when addressing a wide variety of electrophor- 
esis images [1], compound fitting as a quantification tool 
could be combined with other spot detection algorithms 
or the manual identification of spots. This combination 
and the influence of different spot detection methods on 
the quantification process are described in Additional file 7. 

With respect to A|3 peptides, high-quality quantifica- 
tion is necessary to use these peptides as biomarkers 
[17]. In addition, the exact quantification of 2-DE spots 
will continue to be crucial for the identification of differ- 
ences in the amounts of A|3 species in the plasma of pa- 
tients with Alzheimer's disease and for the development 
of additional molecular markers. 

Additional files 



Additional file 1: Supplementary methods: Data Simulation in detail. 

Additional file 2: Matlab code: Compound fitting. 

Additional file 3: Evaluation of compound fitting on synthetic data. 
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Additional file 4: Figure SI. Performance of compound fitting on 
simulated data sets, (a) Influence of compound area size and SNR on the 
quality of fit For each curve, the deviation of the fit result from the true 
VUS is displayed for varying compound area size d. For comparison, the 
curves for the different SNRs are plotted above each other The inlays 
show examples of simulated spots with respective SNR.200 images per 
data point, (b) Fvaluation of the quality of fit on asymmetric Gaussian-shaped 
spots. The deviation of the fit result from the true spot signal is displayed as a 
function of the asymmetry of the spot. The inlays show simulated spots with 
oy Oy = 1, 2 and 3, from left to right (c) Comparison of compound fitting 
and usual fitting. The computation time in seconds for compound fitting 
(red) and usual fitting (blue) is displayed in dependence of the number of 
simulated spots in the image. 200 images per data point For details on data 
simulation, refer to additional file 1. 

Additional file 5: Analysis of a large 2-DE image and comparison 
with other methods [25,26]. 

Additional file 6: Figure S2. Immunoblot of Ap peptides in human 
plasma from healthy controls. 

Additional file 7: The combination of compound fitting with 
different spot detection methods. 
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